Abstract

We present $\mathcal{M}\mathcal{EL}^{++}$ ($\mathcal{M}$ denotes Markov logic networks), an extension of the log-linear description logic $\mathcal{EL}^{++}$-LL with concrete domains, nominals, and instances. We use Markov logic networks (MLNs) to find the most probable, classified, and coherent $\mathcal{EL}^{++}$ ontology from an $\mathcal{M}\mathcal{EL}^{++}$ knowledge base. In particular, we develop a novel way to deal with concrete domains (also known as datatypes) by extending the cutting plane inference (CPI) algorithm for MLNs.

Introduction 

In description logics (DLs), a concrete domain is a construct that can be used to define new classes by specifying restrictions on attributes that have literal values (as opposed to relationships to other concepts). Practical applications of DLs usually require concrete properties with values from a fixed domain, such as strings or integers, supporting built-in predicates. In DLs that are extended with concrete domains, partial functions map objects of the abstract domain to values of the concrete domain, and these functions can be used to build complex concepts. For instance, the axiom $Teenager \equiv Person \sqcap \exists age.(\geq, 13) \sqcap \exists age.(\leq, 19)$ defines a teenager as a person whose age is at least 13 and at most 19. In DLs, concrete domains are also known as datatypes. Several probabilistic extensions of DLs exclude datatypes, although they are an essential feature: several knowledge extraction tools produce weighted rules or axioms that contain concrete data values, and reasoning over these data, either to infer new knowledge or to verify correctness, is indispensable. Additionally, recent advances in information extraction have paved the way for the automatic construction and growth of large semantic knowledge bases from different sources. However, the very nature of these extraction techniques entails that the resulting knowledge bases may contain a significant amount of incorrect, incomplete, or even inconsistent (i.e., uncertain) knowledge, which makes efficient reasoning and query answering over this kind of uncertain data a challenge. To address these issues, there are ongoing studies on probabilistic knowledge bases.
The study of extending DLs to handle uncertainty and
vagueness has gained momentum recently. There have been several proposals to add probabilities to various DLs. Probabilistic DLs can be classified along several dimensions. One possible classification is by the reasoning mechanism used: Markov logic networks (MLNs), Bayesian networks, and probabilistic reasoning. Some studies employ MLNs to extend various DLs. The study in (Lukasiewicz et al. 2012) extends $\mathcal{EL}^{++}$ with probabilistic uncertainty based on the annotation of axioms using MLNs. The main focus of that work is ranking queries in descending order of probability of atomic inferences, which differs from the objective of this paper. Another study (Niepert, Noessner, and Stuckenschmidt 2011) presents a probabilistic extension of the DL $\mathcal{EL}^{++}$ without nominals and concrete domains in MLNs in order to find the most probable coherent ontology. In doing so, the authors developed a reasoner for probabilistic OWL-EL called ELOG (Noessner and Niepert 2011). In this study, we extend this work to deal with concrete domains in addition to nominals and instances. In databases, MLNs have been used to create a probabilistic version of Datalog+/-, an extension of datalog that allows ontological axioms to be expressed by rule-based constraints (Gottlob et al. 2013). The probabilistic extension of Datalog+/- uses MLNs as the underlying probabilistic semantics. The focus of that work is scalable threshold query answering, which differs from the focus of this work.

Other works extend DLs with Bayesian networks. Notably, an extension of $\mathcal{EL}$ with Bayesian networks called $\mathcal{BEL}$ is presented in (Ceylan and Penaloza 2014). The authors study the complexity of reasoning in $\mathcal{BEL}$ and show that reasoning is intractable. However, their work does not discuss probabilities in the ABox, and concrete domains are excluded. On the other hand, (d’Amato, Fanizzi, and Lukasiewicz 2008) add uncertainty to DL-Lite based on Bayesian networks. Additionally, they show that satisfiability testing and query answering in probabilistic DL-Lite can be reduced to satisfiability testing and query answering in the DL-Lite family. Further, they prove that satisfiability checking and answering unions of conjunctive queries can be done in LogSpace in the data complexity.

Consequently, as discussed above, most of the studies that extend description logics to deal with uncertainty by using either Bayesian networks or MLNs exclude concrete domains. This is partly due to either the lack of supporting features or the difficulty of dealing with them. In this paper, we study a novel way of dealing with uncertainty involving concrete domains. To this end, we provide an extension of $\mathcal{EL}^{++}$-LL with concrete domains, nominals, and instances.


Preliminaries 

In this section, we present a brief summary of $\mathcal{EL}^{++}$, Markov logic networks, cutting plane inference, and $\mathcal{EL}^{++}$-LL. For a detailed discussion of these subjects, we refer the reader to (Baader, Brandt, and Lutz 2005; Richardson and Domingos 2006; Riedel 2012; Niepert, Noessner, and Stuckenschmidt 2011) and the references therein.


$\mathcal{EL}^{++}$

$\mathcal{EL}^{++}$ is the description logic underlying the OWL 2 profile OWL-EL.

Syntax Given a set of concept names $N_C$, role names $N_R$, individuals $N_I$, and feature names $N_F$, $\mathcal{EL}^{++}$ concepts and roles are formed according to the following syntax:

$$C ::= \top \mid \bot \mid A \mid C \sqcap D \mid \exists R.C \mid \{a\} \mid \exists F.r$$

A concept in $\mathcal{EL}^{++}$ is either the top concept, the bottom concept, an atomic concept, or a complex concept (formed by conjunction and existential restriction). Given a datatype restriction $r = (o, v)$ and $x \in \mathcal{D}$, we say that $x$ satisfies $r$, and write $r(x)$, iff $(x, v) \in o$, where $o \in \{<, \leq, >, \geq, =\}$ is interpreted as the standard relation on real numbers and $\mathcal{D} \subseteq \mathbb{R}$ is a concrete domain (Despoina, Kazakov, and Horrocks 2011). In this work, we consider only numerical concrete domains and leave the others for future work. An $\mathcal{EL}^{++}$ TBox contains a set of general concept inclusion (GCI) axioms, i.e., $C \sqsubseteq D$, as well as role inclusion axioms, i.e., $R_1 \circ \cdots \circ R_k \sqsubseteq R$.
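To make the satisfaction relation for datatype restrictions concrete, the following is a minimal sketch in Python. The names `OPS`, `satisfies`, and `is_teenager_age` are illustrative assumptions and do not come from the paper; the second function encodes the "at least 13 and at most 19" reading of the Teenager example in the introduction.

```python
import operator

# Comparison operators allowed in a datatype restriction r = (o, v).
# The ASCII spellings "<=" and ">=" stand for the non-strict relations.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
}


def satisfies(restriction, x):
    """Return True iff the value x satisfies r = (o, v), i.e. (x, v) is in o."""
    op, v = restriction
    return OPS[op](x, v)


def is_teenager_age(age):
    # Illustrative helper: both datatype restrictions of the Teenager axiom.
    return satisfies((">=", 13), age) and satisfies(("<=", 19), age)
```

A restriction is modelled as a plain pair so that the same representation can be reused when restrictions are compared with each other later on.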

Semantics The semantics of $\mathcal{EL}^{++}$ concepts and roles is given by an interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$, which consists of a non-empty (abstract) domain $\Delta^{\mathcal{I}}$ and a mapping function $\cdot^{\mathcal{I}}$ (Baader, Brandt, and Lutz 2005). The mapping assigns to each atomic concept $A \in N_C$ a subset of $\Delta^{\mathcal{I}}$, to each abstract role $R \in N_R$ a subset of $\Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$, to each concrete relation $F \in N_F$ a subset of $\Delta^{\mathcal{I}} \times \mathcal{D}$, and to each individual $a \in N_I$ an element of $\Delta^{\mathcal{I}}$. The mapping $\cdot^{\mathcal{I}}$ is
extended to all concepts and roles as follows:


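$$
\begin{aligned}
(\top)^{\mathcal{I}} &= \Delta^{\mathcal{I}} \\
(\bot)^{\mathcal{I}} &= \emptyset \\
(\{a\})^{\mathcal{I}} &= \{a^{\mathcal{I}}\} \\
(C \sqcap D)^{\mathcal{I}} &= C^{\mathcal{I}} \cap D^{\mathcal{I}} \\
(\exists R.C)^{\mathcal{I}} &= \{x \in \Delta^{\mathcal{I}} \mid \exists y \in \Delta^{\mathcal{I}} : (x, y) \in R^{\mathcal{I}} \wedge y \in C^{\mathcal{I}}\} \\
(\exists F.r)^{\mathcal{I}} &= \{x \in \Delta^{\mathcal{I}} \mid \exists v \in \mathcal{D} : (x, v) \in F^{\mathcal{I}} \wedge r(v)\} \\
(C \sqsubseteq D)^{\mathcal{I}} &= C^{\mathcal{I}} \subseteq D^{\mathcal{I}} \\
(R_1 \circ \cdots \circ R_k \sqsubseteq R)^{\mathcal{I}} &= R_1^{\mathcal{I}} \circ \cdots \circ R_k^{\mathcal{I}} \subseteq R^{\mathcal{I}}
\end{aligned}
$$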


Knowledge about specific objects can be expressed using concept and role assertions of the form $C(a)$ and $R(a, b)$. The axioms and assertions are contained in the TBox and ABox, respectively, which together form a knowledge base (KB). An $\mathcal{EL}^{++}$ knowledge base (or ontology) $O = (\mathcal{T}, \mathcal{A})$ consists of a set $\mathcal{T}$ of general concept inclusion axioms and role inclusion axioms (TBox), and possibly a set $\mathcal{A}$ of assertional axioms (ABox). A concept name $C$ in an ontology $O$ is unsatisfiable iff $C^{\mathcal{I}} = \emptyset$ for each model $\mathcal{I}$ of $O$. An ontology $O$ is incoherent iff there exists an unsatisfiable concept name $C$ in $O$, i.e., $O \models C \sqsubseteq \bot$ (Flouris et al. 2006).

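To simplify the translation of a probabilistic $\mathcal{EL}^{++}$ KB into FOL, we first obtain the normal form of the KB in such a way that satisfiability is preserved (Baader, Brandt, and Lutz 2005; Krotzsch 2011). An $\mathcal{EL}^{++}$ KB is in normal form if its axioms are of the following forms:

$$
\begin{array}{llll}
C(a) & R(a, b) & A \sqsubseteq \bot & \top \sqsubseteq C \\
A \sqsubseteq \{c\} & \{a\} \sqsubseteq \{c\} & A \sqsubseteq C & A \sqcap B \sqsubseteq C \\
\exists R.A \sqsubseteq C & A \sqsubseteq \exists R.B & A \sqsubseteq \exists F.r & \exists F.r \sqsubseteq A \\
R_1 \sqsubseteq R_2 & R_1 \circ R_2 \sqsubseteq R & &
\end{array}
$$

where $A, B, C \in N_C$, $R, R_1, R_2 \in N_R$, $F \in N_F$, $r$ is a datatype restriction, and $a, b, c \in N_I$.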

It is possible to provide a probabilistic extension of $\mathcal{EL}^{++}$ using MLNs. An $\mathcal{EL}^{++}$ KB can be seen as a set of hard constraints on the set of possible interpretations: if an interpretation violates even one axiom or assertion, it has zero probability. The basic idea in MLNs is to soften these constraints, i.e., when an interpretation violates one axiom or assertion in the KB it is less probable, but not impossible. The fewer axioms an interpretation violates, the more probable it becomes. Each axiom and assertion has an associated weight that reflects how strong a constraint it is: the higher the weight, the greater the difference in log probability between an interpretation that satisfies the axiom and one that does not, other things being equal (Richardson and Domingos 2006).


Markov Logic Networks 

Markov Logic Networks (MLNs) combine Markov networks and first-order logic (FOL) by attaching weights to first-order formulas and viewing these as templates for features of Markov networks (Richardson and Domingos 2006). An MLN $L$ is a set of pairs $(F_i, w_i)$, where $F_i$ is a formula in FOL and $w_i$ is a real number representing a weight. Together with a finite set of constants $C$, it defines a Markov network $M_{L,C}$, where $M_{L,C}$ contains one node for each possible grounding of each predicate appearing in $L$. The value of the node is 1 if the ground predicate is true, and 0 otherwise. The probability distribution over possible worlds $x$ specified by the ground Markov network $M_{L,C}$ is given by:

$$P(X = x) = \frac{1}{Z} \exp\left(\sum_{i=1}^{F} w_i n_i(x)\right)$$

where $F$ is the number of formulas in the MLN, $n_i(x)$ is the number of true groundings of $F_i$ in $x$, and $Z$ is the normalization constant. The groundings of a formula are formed simply by replacing its variables with constants in all possible ways. The Herbrand universe $H$ for an MLN $L$ is the set of all terms that can be constructed from the constants in $L$. The Herbrand base $HB$ is often defined as the set of all ground predicates (atoms) that can be constructed using the predicates in $L$ and the terms in $H$. In this paper, we focus on MLNs whose formulas are function-free clauses.
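The log-linear distribution above can be illustrated on a toy MLN by enumerating all possible worlds. The sketch below is only an illustration under stated assumptions: the predicates `Toddler` and `Person`, the constants, and the weight 1.5 are hypothetical, and brute-force enumeration is feasible only for a handful of ground atoms.

```python
import itertools
import math

# Hypothetical toy MLN: two constants, two unary predicates, and one weighted
# formula Toddler(x) => Person(x). None of these values come from the paper.
CONSTANTS = ["john", "mary"]
ATOMS = [(p, c) for p in ("Toddler", "Person") for c in CONSTANTS]


def n_toddler_implies_person(world):
    """Number of true groundings of Toddler(x) => Person(x) in the world."""
    return sum(
        1
        for c in CONSTANTS
        if (not world[("Toddler", c)]) or world[("Person", c)]
    )


# Pairs (n_i, w_i): a counting function for F_i and the weight of F_i.
FORMULAS = [(n_toddler_implies_person, 1.5)]


def all_worlds():
    """Enumerate every truth assignment to the ground atoms."""
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


def unnormalized(world):
    """exp of the weighted sum of true-grounding counts."""
    return math.exp(sum(w * n(world) for n, w in FORMULAS))


# Normalization constant Z: sum over all possible worlds.
Z = sum(unnormalized(x) for x in all_worlds())


def probability(world):
    return unnormalized(world) / Z
```

In this sketch, a world that violates one grounding of the formula is less probable than a world that satisfies both groundings by a factor of exp(1.5), which is the softening of constraints described above.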

In order to compute a maximum a-posteriori (MAP) state 
of an MLN, we formulate the problem as an integer linear 
program (ILP) using the cutting plane inference algorithm. 

Cutting Plane Inference (CPI) 

A MAP query corresponds to an optimization problem with linear constraints and a linear objective function. Hence, it can be formulated and solved as an instance of an integer linear program (ILP). Cutting plane inference was introduced in (Riedel 2012; Noessner, Niepert, and Stuckenschmidt 2013) as a meta-algorithm that transforms an MLN into an ILP. The basic idea of CPI is to add to the ILP all constraints that are violated by the current intermediate solution. This process is repeated until no additional violated ground clauses exist. An ILP solver resolves the conflicts by computing an optimal truth assignment for the MLN. Hence, the solution of the final ILP corresponds to the MAP state. Several iterations are necessary because the intermediate solution changes after each iteration and more violated clauses might be detected. At the beginning of each CPI iteration, it is necessary to determine the violated ground clauses $\mathcal{G}$ that are specified by the MLN and are in conflict with the intermediate solution. A binary ILP variable $x_\ell \in \{0, 1\}$ is assigned to each ground predicate occurring in a violated clause $g \in \mathcal{G}$. The value of the variable $x_\ell$ is 1 if the respective literal $\ell$ is true and 0 if it is false. These variables are used to generate the ILP constraints that are added to the ILP for each violated ground clause. For each clause $g \in \mathcal{G}$, we define $L^+(g)$ as the set of ground atoms that occur unnegated in $g$ and $L^-(g)$ as the set of ground atoms that occur negated in $g$. The transformation scheme depends on the weight $w_g \in \mathbb{R}$ of the violated clause $g$. It is also necessary to create a binary variable $z_g$ for every $g$ with $w_g \neq \infty$ that is used in the objective of the ILP. For every ground clause $g$ with $w_g > 0$, the following constraint has to be added to the ILP:

$$\sum_{\ell \in L^+(g)} x_\ell + \sum_{\ell \in L^-(g)} (1 - x_\ell) \geq z_g$$

A ground atom $\ell$ that is set to false (true if it appears negated) by evidence is not included in the ILP, as it cannot fulfil the respective constraint. For every $g$ with weight $w_g < 0$, we add the following constraint to the ILP:

$$\sum_{\ell \in L^+(g)} x_\ell + \sum_{\ell \in L^-(g)} (1 - x_\ell) \leq (|L^+(g)| + |L^-(g)|) z_g$$

The variable $z_g$ expresses whether a ground formula $g$ is true in the optimal solution of the ILP. However, for every $g$ with weight $w_g = \infty$, this variable can be replaced with 1, as the respective formula cannot be violated in any solution:

$$\sum_{\ell \in L^+(g)} x_\ell + \sum_{\ell \in L^-(g)} (1 - x_\ell) \geq 1$$

Finally, the objective of the ILP sums up the weights of the (satisfied) ground formulas:

$$\max \sum_{g \in \mathcal{G}} w_g z_g$$

The MAP state corresponds to the solution of the ILP in the last CPI iteration. It can be obtained directly from the solution, as the assignment of the variables $x_\ell$ maps to the optimal truth values of the ground predicates, i.e., $X_\ell = true$ if the corresponding ILP variable is 1 and $X_\ell = false$ otherwise. The MAP state of an $\mathcal{EL}^{++}$-LL TBox can be computed by a reduction into CPI.
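The translation of weighted ground clauses into ILP constraints can be sketched as follows. This is a simplified illustration, not the CPI implementation: it encodes all ground clauses at once instead of adding violated clauses iteratively, and it replaces the ILP solver with exhaustive search over the binary variables. The clause representation, the function names, and the example weights are assumptions made for this sketch.

```python
import itertools

# A ground clause is (positive_atoms, negative_atoms, weight).
# weight None marks a hard clause (infinite weight).


def clause_value(clause, x):
    """Left-hand side of the constraint: number of satisfied literals."""
    pos, neg, _ = clause
    return sum(x[a] for a in pos) + sum(1 - x[a] for a in neg)


def z_feasible(clause, x, z):
    """Check the constraint that links z_g to the literal variables."""
    pos, neg, w = clause
    lhs = clause_value(clause, x)
    if w is None:  # hard clause: z_g is replaced by 1
        return lhs >= 1
    if w > 0:
        return lhs >= z
    return lhs <= (len(pos) + len(neg)) * z  # non-positive weight


def map_state(atoms, clauses):
    """Exhaustive stand-in for the ILP solver: maximise sum of w_g * z_g."""
    best_score, best_x = None, None
    for bits in itertools.product([0, 1], repeat=len(atoms)):
        x = dict(zip(atoms, bits))
        score, feasible = 0.0, True
        for c in clauses:
            w = c[2]
            if w is None:
                feasible = feasible and z_feasible(c, x, 1)
                continue
            # Choose the best feasible value of z_g for this clause.
            score += max(w * z for z in (0, 1) if z_feasible(c, x, z))
        if feasible and (best_score is None or score > best_score):
            best_score, best_x = score, x
    return best_x, best_score


# Hypothetical ground network: a hard transitivity clause and three
# weighted unit clauses over subsumption atoms.
ATOMS = ["sub(A,B)", "sub(B,C)", "sub(A,C)"]
CLAUSES = [
    (["sub(A,C)"], ["sub(A,B)", "sub(B,C)"], None),
    (["sub(A,B)"], [], 0.8),
    (["sub(B,C)"], [], 0.6),
    (["sub(A,C)"], [], -0.5),
]
```

In this example, accepting both weighted subsumptions forces `sub(A,C)` to be true through the hard clause, so the penalty of -0.5 is paid and the best objective value is 0.8 + 0.6 - 0.5.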

$\mathcal{EL}^{++}$-LL

$\mathcal{EL}^{++}$-LL (log-linear $\mathcal{EL}^{++}$) is a probabilistic extension of $\mathcal{EL}^{++}$ without nominals, instances, and concrete domains (Niepert, Noessner, and Stuckenschmidt 2011). Each $\mathcal{EL}^{++}$-LL TBox axiom is either deterministic (i.e., known to be true) or uncertain (i.e., it has a degree of confidence). The uncertain axioms have associated weights. Formally, an $\mathcal{EL}^{++}$-LL TBox is given by $\mathcal{T} = (\mathcal{T}^D, \mathcal{T}^U)$, where $\mathcal{T}^D$ is a set of deterministic axioms and $\mathcal{T}^U$ is a set of uncertain axioms, given as pairs $\langle S, w_S \rangle$ where $S$ is an axiom and $w_S$ is its real-valued weight.

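The semantics of an $\mathcal{EL}^{++}$-LL TBox is given by a joint probability distribution over coherent $\mathcal{EL}^{++}$ TBoxes. Given TBoxes $\mathcal{T} = (\mathcal{T}^D, \mathcal{T}^U)$ and $\mathcal{T}'$ over the same vocabulary, the probability of $\mathcal{T}'$ is given by:

$$
P(\mathcal{T}') =
\begin{cases}
\frac{1}{Z} \exp\left(\sum_{\{\langle S, w_S \rangle \in \mathcal{T}^U \,:\, \mathcal{T}' \models S\}} w_S\right) & \text{if } \mathcal{T}' \models \mathcal{T}^D \wedge \mathcal{T}' \not\models \bot \\
0 & \text{otherwise}
\end{cases}
$$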

In order to generate the most probable, coherent, and classified TBox using MLNs, the $\mathcal{EL}^{++}$ completion rules and $\mathcal{EL}^{++}$-LL TBox axioms are translated into FOL formulae. In the following, we show how to extend $\mathcal{EL}^{++}$-LL with nominals, instances, and concrete domains.



Extending $\mathcal{EL}^{++}$-LL with Nominals, Instances and Concrete Domains

In (Niepert, Noessner, and Stuckenschmidt 2011), the authors claim that their approach is extensible to the Horn fragments of DLs (see (Krotzsch 2011), for instance). To take advantage of this claim, we extend $\mathcal{EL}^{++}$-LL with probabilistic knowledge expressed through nominals, individuals, and concrete domains. The syntax of this extension (which we call $\mathcal{M}\mathcal{EL}^{++}$) is the same as that of $\mathcal{EL}^{++}$-LL: it is the syntax of $\mathcal{EL}^{++}$ with weights attached to each uncertain axiom and assertion. An $\mathcal{M}\mathcal{EL}^{++}$ KB has two components: a deterministic knowledge base $KB^D$ and an uncertain knowledge base $KB^U$. In order to provide semantics, we assume that $KB^D$ is coherent. The semantics of coherent $\mathcal{M}\mathcal{EL}^{++}$ KBs is given by a probability distribution as defined below.

Definition 1 Given an $\mathcal{M}\mathcal{EL}^{++}$ knowledge base $KB = (KB^D, KB^U)$ over a vocabulary of $N_C$, $N_R$, $N_F$, and $N_I$, the semantics of a coherent $KB' = (KB'^D, KB'^U)$ over the same vocabulary is given by a probability distribution:

$$
P(KB') =
\begin{cases}
\frac{1}{Z} \exp\left(\sum_{\{\langle S, w_S \rangle \in KB^U \,:\, KB' \models S\}} w_S\right) & \text{if } KB' \models KB^D \wedge KB' \not\models \bot \\
0 & \text{otherwise}
\end{cases}
$$

Example 1 Consider an $\mathcal{M}\mathcal{EL}^{++}$ $KB = (KB^D, KB^U)$:

$$
\begin{aligned}
KB^D = \{ & Toddler \sqcap Adult \sqsubseteq \bot \}, \\
KB^U = \{ & \langle Toddler \sqsubseteq \exists age.(<, 3), 0.8 \rangle, \\
& \langle \exists age.(<, 3) \sqsubseteq Person, 0.7 \rangle, \\
& \langle Toddler \sqsubseteq Adult, 0.1 \rangle, \; \langle age(john, 2), 0.7 \rangle \}
\end{aligned}
$$

The probabilities of the axioms and assertions can be computed as follows:

$$
\begin{aligned}
P(\{Toddler \sqsubseteq \exists age.(<, 3)\}) &= \frac{1}{Z} \exp(0.8) \\
P(\{Toddler \sqsubseteq Adult\}) &= 0 \\
P(\{Toddler \sqsubseteq \exists age.(<, 3), \; age(john, 2), \; \exists age.(<, 3) \sqsubseteq Person\}) &= \frac{1}{Z} \exp(2.2) \\
P(\{\}) &= \frac{1}{Z} \exp(0) \\
P(\{Toddler \sqcap Adult \sqsubseteq \bot\}) &= 1 \\
Z &= \exp(0.8) + \exp(2.2) + \exp(0.7) + \exp(0)
\end{aligned}
$$

In order to derive the most probable, classified, and coherent $\mathcal{EL}^{++}$ ontology from an $\mathcal{M}\mathcal{EL}^{++}$ KB, we transform the KB, the TBox completion rules (Baader, Brandt, and Lutz 2005), concrete domains, and the ABox completion rules (Krotzsch 2011) into FOL formulae.


Nominals

(Un)certain axioms that contain nominals can be translated into FOL in the MLN by using Definition 2. Inference in the MLN can be done by converting the completion rule CR6 (Baader, Brandt, and Lutz 2005) into FOL and enforcing that each nominal $a_i \in N_I$ is distinct. Alternatively, the unique name assumption for individual names can be enforced by using the axiom $\{a\} \sqcap \{b\} \sqsubseteq \bot$ for all relevant individual names $a$ and $b$. In addition, the transformation of the TBox completion rules into FOL in the MLN is given in Table 1.

By using nominals, instance knowledge can be added to an ABox.

ABox

Since the description logic $\mathcal{EL}^{++}$ is equipped with nominals, ABox knowledge can be converted into TBox axioms. Thus, with nominals, the ABox becomes syntactic sugar:

$$C(a) \Leftrightarrow \{a\} \sqsubseteq C, \qquad R(a, b) \Leftrightarrow \{a\} \sqsubseteq \exists R.\{b\}$$

Instance checking in turn is directly reducible to subsumption checking in the presence of nominals. There are two ways to represent uncertain ABox assertions, i.e., $C(a)$ and $R(a, b)$, in an MLN:

i. transform ABox assertions into TBox axioms using nominals as follows:

$$
\begin{aligned}
\langle C(a), w_1 \rangle &\mapsto \langle \{a\} \sqsubseteq C, w_1 \rangle \\
\langle R(a, b), w_2 \rangle &\mapsto \langle \{a\} \sqsubseteq \exists R.\{b\}, w_2 \rangle
\end{aligned}
$$

ii. introduce two new predicates for each instance type as:

$$
\begin{aligned}
\langle C(a), w_1 \rangle &\mapsto inst(a, C) \quad w_1 \\
\langle R(a, b), w_2 \rangle &\mapsto rinst(a, R, b) \quad w_2
\end{aligned}
$$

This approach requires transforming the ABox completion rules into FOL, so as to generate classified ontologies.

In this paper, we consider the second approach (ii).² Next, we show how concrete domains are translated into the MLN framework.


Concrete Domains 

Reasoning over uncertain concrete domains can be done by transforming the datatype predicates in the axioms and assertions into mixed integer programming, as shown in (Straccia 2012). In this work, however, we introduce an efficient approach that transforms the predicates into a test function that evaluates to true or false based on the groundings generated by an extension of the CPI algorithm. Inference involving axioms that contain concrete domains can be done according to the deduction rules given below:


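$$
\frac{A \sqsubseteq B \qquad B \sqsubseteq \exists F.(o, v)}{A \sqsubseteq \exists F.(o, v)}
$$

$$
\frac{A \sqsubseteq \exists F.(o_1, v_1) \qquad \exists F.(o_2, v_2) \sqsubseteq B}{A \sqsubseteq B} \; eval(o_1, v_1, o_2, v_2)
$$

$$
\frac{\exists F.(o, v_1) \sqsubseteq A \qquad F(a, v_2)}{A(a)} \; eval(o, v_1, =, v_2)
$$

$$
\frac{A(a) \qquad A \sqsubseteq \exists F.(=, v)}{F(a, v)}
$$

² We leave a comparison of the two approaches as future work.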







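$F_1$ - $F_9$   Refer to Table 2 in (Niepert, Noessner, and Stuckenschmidt 2011).
$F_{10}$   $\forall c, d, a, r : \mathrm{subNom}(c, a) \wedge \mathrm{subNom}(d, a) \wedge \mathrm{rsup}(c, r, d) \Rightarrow \mathrm{sub}(c, d)$
$F_{11}$   $\forall c, d, a, r, b : \mathrm{subNom}(c, a) \wedge \mathrm{subNom}(d, a) \wedge \mathrm{rsupNom}(b, r, d) \Rightarrow \mathrm{sub}(c, d)$
$F_{12}$   $\forall c, d, f, o, v : \mathrm{sub}(c, d) \wedge \mathrm{rsupEx}(d, f, o, v) \Rightarrow \mathrm{rsupEx}(c, f, o, v)$
$F_{13}$   $\forall c, d, f, o_1, v_1, o_2, v_2 : \mathrm{rsupEx}(c, f, o_1, v_1) \wedge \mathrm{rsubEx}(f, o_2, v_2, d) \wedge \mathrm{eval}(o_1, v_1, o_2, v_2) \Rightarrow \mathrm{sub}(c, d)$

Table 1: TBox completion rules.

$F_{14}$   $\forall x, A, B : \mathrm{inst}(x, A) \wedge \mathrm{sub}(A, B) \Rightarrow \mathrm{inst}(x, B)$
$F_{15}$   $\forall x, A_1, A_2, B : \mathrm{inst}(x, A_1) \wedge \mathrm{inst}(x, A_2) \wedge \mathrm{int}(A_1, A_2, B) \Rightarrow \mathrm{inst}(x, B)$
$F_{16}$   $\forall x, y, R, A, B : \mathrm{rinst}(x, R, y) \wedge \mathrm{inst}(y, A) \wedge \mathrm{rsub}(A, R, B) \Rightarrow \mathrm{inst}(x, B)$
$F_{17}$   $\forall x, y, R, S : \mathrm{rinst}(x, R, y) \wedge \mathrm{psub}(R, S) \Rightarrow \mathrm{rinst}(x, S, y)$
$F_{18}$   $\forall x, y, z, R_1, R_2, R_3 : \mathrm{rinst}(x, R_1, y) \wedge \mathrm{rinst}(y, R_2, z) \wedge \mathrm{pcom}(R_1, R_2, R_3) \Rightarrow \mathrm{rinst}(x, R_3, z)$
$F_{19}$   $\forall x, a, B : \mathrm{ninst}(x, a) \wedge \mathrm{inst}(x, B) \Rightarrow \mathrm{inst}(a, B)$
$F_{20}$   $\forall x, a, B : \mathrm{ninst}(x, a) \wedge \mathrm{inst}(a, B) \Rightarrow \mathrm{inst}(x, B)$
$F_{21}$   $\forall x, a, z, R : \mathrm{ninst}(x, a) \wedge \mathrm{rinst}(z, R, x) \Rightarrow \mathrm{rinst}(z, R, a)$
$F_{22}$   $\forall x, A, B : \mathrm{sub}(\top, A) \wedge \mathrm{inst}(x, B) \Rightarrow \mathrm{inst}(x, A)$
$F_{23}$   $\forall x, x', R, A, B : \mathrm{inst}(x, A) \wedge \mathrm{rsup}(A, R, B) \Rightarrow \mathrm{rinst}(x, R, x')$
$F_{24}$   $\forall x, x', R, A, B : \mathrm{inst}(x, A) \wedge \mathrm{rsup}(A, R, B) \Rightarrow \mathrm{inst}(x', B)$
$F_{25}$   $\forall f, op, v, C : \mathrm{rsupEx}(f, op, v, C) \wedge \mathrm{rinst}(a, f, v') \wedge \mathrm{eval}(v, op, v') \Rightarrow \mathrm{inst}(a, A)$
$F_{26}$   $\forall a, A, f, v : \mathrm{inst}(a, A) \wedge \mathrm{rsubEx}(A, f, =, v) \Rightarrow \mathrm{rinst}(a, f, v)$
$F_{27}$   $\forall a, A_1, A_2, f, v : \mathrm{inst}(a, A_1) \wedge \mathrm{inst}(a, A_2) \wedge \mathrm{intEx}(A_1, A_2, f, op, v) \Rightarrow \mathrm{rinst}(a, f, v)$

Table 2: ABox completion rules.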


where $eval(\ldots)$ checks whether all possible values of the first operator-value pair $(o_1, v_1)$ are covered by the possible values of the second operator-value pair $(o_2, v_2)$; if so, it evaluates to true, otherwise to false. The function $eval(\ldots)$ is defined based on a datatype $\mathcal{D}$, i.e., $\mathbb{N}$, $\mathbb{Z}$, or $\mathbb{R}$, and algebraic operators. Some of the algebraic comparisons computed via $eval(\ldots)$ that are useful for inference are listed below:


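$$
\begin{aligned}
eval(<, v_1, <, v_2) &= v_1 \leq v_2 \\
eval(\leq, v_1, \leq, v_2) &= v_1 \leq v_2 \\
eval(=, v_1, <, v_2) &= v_1 < v_2 \\
eval(=, v_1, \leq, v_2) &= v_1 \leq v_2 \\
eval(=, v_1, =, v_2) &= v_1 = v_2 \\
eval(=, v_1, >, v_2) &= v_1 > v_2 \\
eval(=, v_1, \geq, v_2) &= v_1 \geq v_2 \\
eval(>, v_1, >, v_2) &= v_1 \geq v_2 \\
eval(\geq, v_1, \geq, v_2) &= v_1 \geq v_2 \\
eval(\geq, v_1, >, v_2) &= v_1 > v_2
\end{aligned}
$$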

This function is computed on demand after each CPI iteration, as discussed in the next section. The translation of the deduction rules into FOL is given in Table 1 and Table 2.

Example 2 Consider an $\mathcal{M}\mathcal{EL}^{++}$ KB $= \{\langle 2YearOld \sqsubseteq \exists age.(=, 2), 0.7 \rangle, \langle \exists age.(<, 3) \sqsubseteq Toddler, 0.8 \rangle\}$ that contains axioms expressed using concrete domains. From the KB, the axiom $2YearOld \sqsubseteq Toddler$ can be inferred since $eval(o_1, v_1, o_2, v_2)$ is true, i.e., $eval(=, 2, <, 3) := 2 < 3$.

So far we have discussed how axioms and assertions can be 
translated into FOL. Next, we show how the most probable 
KB is derived using MAP inference. 
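The coverage test and its use in Example 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: values are treated as real numbers, the function is named `eval_cover` to avoid shadowing Python's built-in `eval`, and `apply_datatype_subsumption` is a hypothetical helper that applies the TBox completion rule combining `rsupEx`, `rsubEx`, and `eval` to sets of ground facts.

```python
def eval_cover(o1, v1, o2, v2):
    """True iff every real x satisfying (o1, v1) also satisfies (o2, v2)."""
    if o2 == "=":
        return o1 == "=" and v1 == v2
    if o2 == "<":
        return (
            (o1 == "=" and v1 < v2)
            or (o1 == "<" and v1 <= v2)
            or (o1 == "<=" and v1 < v2)
        )
    if o2 == "<=":
        return o1 in ("=", "<", "<=") and v1 <= v2
    if o2 == ">":
        return (
            (o1 == "=" and v1 > v2)
            or (o1 == ">" and v1 >= v2)
            or (o1 == ">=" and v1 > v2)
        )
    if o2 == ">=":
        return o1 in ("=", ">", ">=") and v1 >= v2
    raise ValueError(f"unknown operator: {o2}")


def apply_datatype_subsumption(rsup_ex, rsub_ex):
    """rsupEx(c, f, o1, v1) and rsubEx(f, o2, v2, d) and eval => sub(c, d)."""
    return {
        (c, d)
        for (c, f1, o1, v1) in rsup_ex
        for (f2, o2, v2, d) in rsub_ex
        if f1 == f2 and eval_cover(o1, v1, o2, v2)
    }


# Ground facts for the two axioms of Example 2 (weights omitted).
rsup_ex = {("2YearOld", "age", "=", 2)}  # 2YearOld is subsumed by age = 2
rsub_ex = {("age", "<", 3, "Toddler")}   # age < 3 is subsumed by Toddler
derived = apply_datatype_subsumption(rsup_ex, rsub_ex)
```

Here `derived` contains the single pair `("2YearOld", "Toddler")`, mirroring the inference drawn in Example 2. Operators that point in opposite directions, such as `">"` against `"<"`, are never covering and yield false.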

Computing a Most Probable KB 

To derive the most probable, classified, and coherent ontology from a weighted $\mathcal{M}\mathcal{EL}^{++}$ KB, we proceed by transforming the TBox and ABox completion rules, schema axioms, and assertions into function-free FOL formulae. The formulae corresponding to the translation of the completion rules into FOL are shown in Table 1 and Table 2. The formulae $F_1$ through $F_9$ are taken from (Niepert, Noessner, and Stuckenschmidt 2011). Additionally, a bijective mapping function is provided in Definition 2 to transform axioms and assertions into formulae. Of particular interest to us is a novel way to deal with concrete domains in MLNs by modifying the cutting plane inference (CPI) algorithm.

In $\mathcal{EL}^{++}$, it is possible to build incoherent TBox axioms due to the presence of the bottom concept $\bot$. For instance, the axiom $\{a\} \sqsubseteq \bot$ cannot be satisfied by any interpretation. To filter out such incoherencies in models generated by the MLN, we add the formula $\forall c : \neg sub(c, \bot)$ (formula $F_9$ in Table 1) to the translation of the completion rules into FOL. This technique has already been used in (Niepert, Noessner, and Stuckenschmidt 2011).

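Definition 2 [Mapping an $\mathcal{M}\mathcal{EL}^{++}$ KB into ground FOL predicates] The function $\varphi$ translates a normalized $\mathcal{M}\mathcal{EL}^{++}$ knowledge base KB into FOL formulae in the MLN as follows:

$$
\begin{array}{lcl}
C(a) & \mapsto & \mathrm{inst}(a, C) \\
R(a, b) & \mapsto & \mathrm{rinst}(a, R, b) \\
A \sqsubseteq \bot & \mapsto & \mathrm{sub}(A, \bot) \\
\top \sqsubseteq C & \mapsto & \mathrm{sub}(\top, C) \\
A \sqsubseteq \{c\} & \mapsto & \mathrm{subNom}(A, \{c\}) \\
\{a\} \sqsubseteq \{c\} & \mapsto & \mathrm{sub}(\{a\}, \{c\}) \\
A \sqsubseteq C & \mapsto & \mathrm{sub}(A, C) \\
A \sqcap B \sqsubseteq C & \mapsto & \mathrm{int}(A, B, C) \\
\exists R.A \sqsubseteq C & \mapsto & \mathrm{rsub}(A, R, C) \\
A \sqsubseteq \exists R.B & \mapsto & \mathrm{rsup}(A, R, B) \\
A \sqsubseteq \exists F.(o, v) & \mapsto & \mathrm{rsupEx}(A, F, o, v) \\
\exists F.(o, v) \sqsubseteq A & \mapsto & \mathrm{rsubEx}(F, o, v, A) \\
R_1 \sqsubseteq R_2 & \mapsto & \mathrm{psub}(R_1, R_2) \\
R_1 \circ R_2 \sqsubseteq R & \mapsto & \mathrm{pcom}(R_1, R_2, R) \\
\{a_i\} \sqcap \{a_j\} \sqsubseteq \bot & \mapsto & \mathrm{int}(\{a_i\}, \{a_j\}, \bot) \text{ where } a_i, a_j \in N_I \text{ and } i \neq j
\end{array}
$$

where $a, b, c \in N_I$, $A, B, C \in N_C$, $R, R_1, R_2 \in N_R$, $F \in N_F$, $o \in \{<, \leq, >, \geq, =\}$, and $v \in \mathbb{R}$ (the set of real numbers).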
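The mapping of Definition 2 can be sketched as a lookup from axiom shapes to ground predicates. The tagged-tuple encoding of axioms and the tag names below are assumptions made for this illustration; only the predicate names and their argument orders are taken from the definition.

```python
# Each normalized axiom is a tuple (tag, *args); the tags are hypothetical.
# Each entry returns the ground atom as a tuple (predicate, *arguments).
PHI = {
    "assert_concept": lambda c, a: ("inst", a, c),                # C(a)
    "assert_role": lambda r, a, b: ("rinst", a, r, b),            # R(a, b)
    "sub": lambda a, c: ("sub", a, c),                            # A sub C
    "sub_nominal": lambda a, c: ("subNom", a, c),                 # A sub {c}
    "conj": lambda a, b, c: ("int", a, b, c),                     # A and B sub C
    "exists_lhs": lambda r, a, c: ("rsub", a, r, c),              # some R.A sub C
    "exists_rhs": lambda a, r, b: ("rsup", a, r, b),              # A sub some R.B
    "datatype_rhs": lambda a, f, o, v: ("rsupEx", a, f, o, v),    # A sub some F.(o, v)
    "datatype_lhs": lambda f, o, v, a: ("rsubEx", f, o, v, a),    # some F.(o, v) sub A
    "role_sub": lambda r1, r2: ("psub", r1, r2),                  # R1 sub R2
    "role_chain": lambda r1, r2, r: ("pcom", r1, r2, r),          # R1 o R2 sub R
}


def phi(axiom):
    """Translate one normalized axiom into its ground predicate."""
    tag, *args = axiom
    return PHI[tag](*args)


def translate(kb):
    """Map weighted axioms (axiom, weight) to weighted ground atoms."""
    return [(phi(axiom), weight) for axiom, weight in kb]


# The uncertain axioms of Example 2 in the hypothetical tuple encoding.
EXAMPLE_2 = [
    (("datatype_rhs", "2YearOld", "age", "=", 2), 0.7),
    (("datatype_lhs", "age", "<", 3, "Toddler"), 0.8),
]
```

Because every axiom shape has exactly one target predicate, the table can be inverted to recover axioms from ground atoms, in line with the bijectivity claimed for the mapping. The `"sub"` entry is shared by the axiom shapes that Definition 2 sends to the `sub` predicate, including those with the top concept, the bottom concept, or nominals on both sides.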

Lemma 1 The translation of an $\mathcal{M}\mathcal{EL}^{++}$ KB into FOL and vice versa can be done in polynomial time in the size of the knowledge base (Lukasiewicz et al. 2012).

From the above lemma, we see that the translation of the $\mathcal{M}\mathcal{EL}^{++}$ completion rules, axioms, and assertions into FOL in the MLN does not affect the complexity of inference in the MLN. Moreover, as typed variables and constants greatly reduce the size of ground Markov networks, we introduce types for all of the predicates shown in Table 1 and Table 2.

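Theorem 1 Given an $\mathcal{M}\mathcal{EL}^{++}$ ontology $KB = (\mathcal{T}, \mathcal{A})$ and $KB' \subseteq KB$, a Herbrand interpretation $\mathcal{H}$ is a model of $KB'$, i.e., $\mathcal{H} \models KB'$, if and only if there exists a mapping function $\varphi$ such that $\varphi(\mathcal{H}) \models KB'$.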

So far, we have introduced a mapping function $\varphi$ for $\mathcal{M}\mathcal{EL}^{++}$ KB assertions and axioms, and expressed the completion rules as formulae ($F_1$-$F_{27}$). The next step uses MAP inference in the MLN to obtain the most probable ontology of a given $\mathcal{M}\mathcal{EL}^{++}$ KB.

Maximum A-Posteriori Inference (MAP) 

In order to deal with $\mathcal{M}\mathcal{EL}^{++}$ datatypes, we introduced a predicate called $eval(\ldots)$ in the translation of the $\mathcal{M}\mathcal{EL}^{++}$ completion rules into FOL, depicted in Table 1 and Table 2. The truth value of $eval(\ldots)$ is computed by evaluating the logical expressions corresponding to datatypes in an $\mathcal{M}\mathcal{EL}^{++}$ KB. For instance, consider the $eval(\ldots)$ predicate in Example 2. In the following, we show how the expression $(=, 2) \subseteq (<, 3)$, i.e., operator-value pair coverage, is evaluated by extending the CPI algorithm with algebraic expressions. In particular, our extension addresses a limitation of MLNs with respect to concrete domains. In general, all (numerical) values are represented as constants in an MLN. The only semantics attached to a constant is the type to which it belongs. This enables more efficient grounding and leads to smaller MLNs, but it hardly captures the characteristics of numerical values. Therefore, we exploit the iterative character of CPI in order to evaluate numerical (in)equalities. The extension can be considered as a set of additional features that are only used on demand. It is formula-specific, as it affects the ground values and the truth value of specific constraints. Hence, it can be implemented as an extension of the detection of violated constraints.

At the beginning of each CPI iteration, the algorithm identifies, for each formula, all groundings that are violated by the current intermediate solution. Each of the violated ground clauses has to be translated and added to the ILP. To this end, an ILP variable is generated for each ground predicate, together with additional ILP constraints. Datatype ground predicates $eval(\ldots)$ appear during this process like any other predicate. However, we exploit their semantics to decide whether $eval(\ldots)$ predicates evaluate to true or false. Depending on the result of evaluating the Boolean expression attached to the respective predicate, we decide whether it is necessary to add the violated ground clause to the ILP. For instance, if the datatype predicate is true (false) and appears unnegated (negated) in the formula, we do not add the ground clause to the ILP, as it is not violated in the current iteration. Otherwise, we add the clause to the ILP but leave out the datatype ground predicates, as they cannot fulfil the violated clause, i.e., the respective literal is false due to evidence. Hence, we do not introduce ILP variables for datatype predicates, as they are never added to the ILP. Instead, we compute the truth value of the datatype predicates on the fly and only on demand. Hence, the proposed approach lets a Markov logic solver process numerical predicates without enlarging the ILP with additional variables. We implemented this algorithm as an extension of the MLN inference engine ROCKIT³ (Noessner, Niepert, and Stuckenschmidt 2013). We leave testing this implementation with different ontologies as future work.
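The treatment of datatype literals during the detection of violated clauses can be sketched as follows. This is not the ROCKIT implementation; it is a minimal illustration in which a ground clause is a list of `(atom, negated)` pairs, datatype atoms are recognised by the predicate name `"eval"`, and `example2_eval` is a hypothetical stand-in that covers only the comparison needed for Example 2.

```python
def example2_eval(o1, v1, o2, v2):
    # Hypothetical stand-in for the coverage test: it handles only the
    # case of an "=" pair checked against a "<" pair, as in Example 2.
    if (o1, o2) == ("=", "<"):
        return v1 < v2
    raise NotImplementedError((o1, o2))


def prepare_violated_clause(literals, eval_fn):
    """Decide how a candidate ground clause enters the ILP.

    Returns None if some datatype literal is true, because the clause is
    then satisfied and need not be added. Otherwise returns the clause
    without its datatype literals, so that no ILP variable is created
    for them.
    """
    remaining = []
    for atom, negated in literals:
        if atom[0] == "eval":
            truth = eval_fn(*atom[1:])
            if truth != negated:  # the literal itself is true
                return None
            continue  # the literal is false: leave it out
        remaining.append((atom, negated))
    return remaining


# Clause form of the datatype subsumption rule, grounded for Example 2:
# not rsupEx(...) or not rsubEx(...) or not eval(...) or sub(...).
clause = [
    (("rsupEx", "2YearOld", "age", "=", 2), True),
    (("rsubEx", "age", "<", 3, "Toddler"), True),
    (("eval", "=", 2, "<", 3), True),
    (("sub", "2YearOld", "Toddler"), False),
]
to_ilp = prepare_violated_clause(clause, example2_eval)
```

In this sketch, `to_ilp` keeps the three non-datatype literals, since the negated `eval` literal is false and cannot satisfy the clause. Had the coverage test failed, the negated literal would be true and the function would return `None`, so the clause would be skipped in that iteration.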

Theorem 2 Given the following:

• an $\mathcal{M}\mathcal{EL}^{++}$ knowledge base $KB = (KB^D, KB^U)$ formed from a vocabulary containing a finite set of individuals $N_I$, concepts $N_C$, features $N_F$, and roles $N_R$,

• $HB$ as a Herbrand base of the formulae $F$ in Table 1 and Table 2 over the same vocabulary,

• $G^D$ as a set of ground formulae constructed from $KB^D$, and

• $G^U$ as a set of ground formulae constructed from $KB^U$,

the most probable coherent and classified ontology is obtained with:

$$\varphi^{-1}(I), \quad I = \operatorname*{argmax}_{HB \supseteq I \models G^D \cup F} \left( \sum_{\{\langle g, w_g \rangle \in G^U \,:\, I \models g\}} w_g \right)$$

³ https://code.google.com/p/rockit/

From Theorem 2 and the results in (Roth 1996), finding the most probable, classified, and coherent $\mathcal{M}\mathcal{EL}^{++}$ KB is in NP. The hardness of this complexity bound can be obtained by reducing the partial weighted MaxSAT problem to an $\mathcal{M}\mathcal{EL}^{++}$ MAP query. Consequently, the MAP problem for $\mathcal{M}\mathcal{EL}^{++}$ is NP-hard.

Conclusion 

In this work, we have extended $\mathcal{EL}^{++}$-LL into $\mathcal{M}\mathcal{EL}^{++}$ with nominals, concrete domains, and instances. In particular, we proposed an extension of the CPI algorithm in order to deal with reasoning under uncertain concrete domains. We have implemented the proposed approach and plan to carry out experiments in the future. We will also investigate extending the proposed approach to other datatypes such as Date, Time, and so on.